16 research outputs found

    Parametric UMAP embeddings for representation and semi-supervised learning

    UMAP is a non-parametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) Compute a graphical representation of a dataset (fuzzy simplicial complex), and (2) Through stochastic gradient descent, optimize a low-dimensional embedding of the graph. Here, we extend the second step of UMAP to a parametric optimization over neural network weights, learning a parametric relationship between data and embedding. We first demonstrate that Parametric UMAP performs comparably to its non-parametric counterpart while conferring the benefit of a learned parametric mapping (e.g. fast online embeddings for new data). We then explore UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semi-supervised learning by capturing structure in unlabeled data. Google Colab walkthrough: https://colab.research.google.com/drive/1WkXVZ5pnMrm17m0YgmtoNjM_XHdnE5Vp?usp=sharin

    A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations

    © The Author(s), 2022. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Thomas, M., Jensen, F. H., Averly, B., Demartsev, V., Manser, M. B., Sainburg, T., Roch, M. A., & Strandburg-Peshkin, A. A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations. The Journal of Animal Ecology, 91(8), (2022): 1567–1581, https://doi.org/10.1111/1365-2656.13754.
    1. Background: The manual detection, analysis and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups and quantify similarities between them. Among all computational methods that have been proposed to accomplish this, neighbourhood-based dimensionality reduction of spectrograms to produce a latent space representation of calls stands out for its conceptual simplicity and effectiveness.
    2. Goal of the study/what was done: Using a dataset of manually annotated meerkat Suricata suricatta vocalizations, we demonstrate how this method can be used to obtain meaningful latent space representations that reflect the established taxonomy of call types. We analyse strengths and weaknesses of the proposed approach, give recommendations for its usage and show application examples, such as the classification of ambiguous calls and the detection of mislabelled calls.
    3. What this means: All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.
    This work was supported by HFSP Research Grant RGP0051/2019 to ASP, MBM and MAR, and funded by the Deutsche Forschungsgemeinschaft (DFG) under Germany's Excellence Strategy (EXC-2117-422037984). ASP received additional funding from the Gips-Schüle Stiftung, the Zukunftskolleg at the University of Konstanz and the Max-Planck-Institute of Animal Behaviour. VD was funded by the Minerva Stiftung and Alexander von Humboldt Foundation.

    American postdoctoral salaries do not account for growing disparities in cost of living

    The National Institutes of Health (NIH) sets postdoctoral (postdoc) trainee stipend levels that many American institutions and investigators use as a basis for postdoc salaries. Although salary standards are held constant across universities, the cost of living in those universities' cities and towns varies widely. Across non-postdoc jobs, more expensive cities pay workers higher wages that scale with an increased cost of living. This work investigates the extent to which postdoc wages account for cost-of-living differences. More than 27,000 postdoc salaries across all US universities are analyzed alongside measures of regional differences in cost of living. We find that postdoc salaries do not account for cost-of-living differences, in contrast with the broader labor market in the same cities and towns. Despite a modest increase in income in high cost-of-living areas, real (cost-of-living-adjusted) postdoc salaries differ by 29% ($15k 2021 USD) between the least and most expensive areas. Cities that produce greater numbers of tenure-track faculty relative to students, such as Boston, New York, and San Francisco, are among the most impacted by this pay disparity. The postdoc pay gap is growing and is well-positioned to incur a greater financial burden on economically disadvantaged groups and contribute to faculty hiring disparities in women and racial minorities.
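    The idea of a "real (cost-of-living-adjusted)" salary can be sketched as follows. This is a minimal illustration, not the study's method: it assumes a regional price index scaled so that 100 is the national average, and the function name `real_salary` and all figures are hypothetical. The abstract does not say which cost-of-living measure was used.

```python
def real_salary(nominal_usd: float, price_index: float) -> float:
    """Deflate a nominal salary by a regional price index (100 = national average)."""
    if price_index <= 0:
        raise ValueError("price_index must be positive")
    return nominal_usd * 100.0 / price_index


# Hypothetical figures for illustration only; not taken from the study.
nominal = 55_000.0                          # same nominal stipend in both areas
low_cost = real_salary(nominal, 90.0)       # area cheaper than the national average
high_cost = real_salary(nominal, 120.0)     # area more expensive than the average

# With identical nominal pay, purchasing power is lower in the expensive area.
gap_usd = low_cost - high_cost
print(f"real salary gap: {gap_usd:,.0f} USD")
```

    Holding the nominal stipend fixed and varying only the index isolates the effect the abstract describes: the same pay buys less where prices are higher.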

    Long-range sequential dependencies precede complex syntactic production in language acquisition

    To convey meaning, language relies on hierarchically organized, long-range relationships spanning words, phrases, sentences, and discourse. As the distances between elements in language sequences increase, the strength of the long-range relationships between those elements decays following a power law. This power-law relationship has been attributed variously to long-range sequential organization present in language syntax, semantics, and discourse structure. However, non-linguistic behaviors in numerous phylogenetically distant species, ranging from humpback whale song to fruit fly motility, demonstrate similar long-range statistical dependencies. Therefore, we hypothesized that long-range statistical dependencies in speech may occur independently of linguistic structure. To test this hypothesis, we measured long-range dependencies in speech corpora from children (aged 6 months to 12 years). We find that adult-like power-law statistical dependencies are present in human vocalizations prior to the production of complex linguistic structure. These linguistic structures cannot, therefore, be the sole cause of long-range statistical dependencies in language.

    Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires.

    Animals produce vocalizations that range in complexity from a single repeated call to hundreds of unique vocal elements patterned in sequences unfolding over hours. Characterizing complex vocalizations can require considerable effort and a deep intuition about each species' vocal behavior. Even with a great deal of experience, human characterizations of animal communication can be affected by human perceptual biases. We present a set of computational methods for projecting animal vocalizations into low dimensional latent representational spaces that are directly learned from the spectrograms of vocal signals. We apply these methods to diverse datasets from over 20 species, including humans, bats, songbirds, mice, cetaceans, and nonhuman primates. Latent projections uncover complex features of data in visually intuitive and quantifiable ways, enabling high-powered comparative analyses of vocal acoustics. We introduce methods for analyzing vocalizations as both discrete sequences and as continuous latent variables. Each method can be used to disentangle complex spectro-temporal structure and observe long-timescale organization in communication